Word | Frequency | Number of right neighbors | Number of left neighbors | Ratio |
---|---|---|---|---|
s | 12233 | 538 | 1 | 538.0000 |
t | 4968 | 225 | 1 | 225.0000 |
re | 1165 | 71 | 1 | 71.0000 |
He | 3318 | 154 | 3 | 51.3333 |
m | 800 | 50 | 1 | 50.0000 |
ve | 868 | 46 | 1 | 46.0000 |
ll | 672 | 43 | 1 | 43.0000 |
The | 11807 | 736 | 18 | 40.8889 |
They | 1797 | 89 | 3 | 29.6667 |
In | 2441 | 96 | 4 | 24.0000 |
This | 1725 | 69 | 3 | 23.0000 |
But | 2784 | 64 | 3 | 21.3333 |
She | 1118 | 63 | 3 | 21.0000 |
And | 1955 | 59 | 3 | 19.6667 |
We | 2346 | 96 | 5 | 19.2000 |
Its | 465 | 34 | 2 | 17.0000 |
York | 399 | 17 | 1 | 17.0000 |
It | 3665 | 81 | 5 | 16.2000 |
Then | 346 | 16 | 1 | 16.0000 |
Even | 360 | 15 | 1 | 15.0000 |
Word | Frequency | Number of right neighbors | Number of left neighbors | Ratio |
---|---|---|---|---|
p.m | 911 | 1 | 41 | 0.0244 |
don | 1241 | 1 | 27 | 0.0370 |
it. | 967 | 2 | 53 | 0.0377 |
didn | 639 | 1 | 25 | 0.0400 |
a.m | 414 | 1 | 21 | 0.0476 |
amount | 227 | 1 | 14 | 0.0714 |
able | 592 | 1 | 13 | 0.0769 |
wasn | 327 | 1 | 13 | 0.0769 |
couldn | 181 | 1 | 13 | 0.0769 |
doesn | 519 | 1 | 13 | 0.0769 |
isn | 358 | 1 | 12 | 0.0833 |
kind | 462 | 2 | 23 | 0.0870 |
ability | 221 | 1 | 11 | 0.0909 |
type | 177 | 1 | 11 | 0.0909 |
system. | 133 | 1 | 11 | 0.0909 |
up. | 266 | 2 | 21 | 0.0952 |
off. | 134 | 1 | 10 | 0.1000 |
lack | 202 | 1 | 10 | 0.1000 |
U.S | 626 | 1 | 10 | 0.1000 |
wouldn | 171 | 1 | 10 | 0.1000 |
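In this subsection, we compute, for each word, the ratio of the number of distinct right neighbors to the number of distinct left neighbors. As before, we look for words with extreme ratios: the first table lists the 20 words with the highest ratios, the second the 20 words with the lowest.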
Data for first table:
select word, w.freq, aa.cnt, bb.cnt, aa.cnt/bb.cnt as r from words w, (select w1_id, count(c.w2_id) as cnt from co_n c where w1_id>100 group by w1_id) aa, (select w2_id, count(c.w1_id) as cnt from co_n c where w2_id>100 group by w2_id) bb where w_id=aa.w1_id and aa.w1_id=bb.w2_id order by r desc limit 20;
Diagram data:
select aa.cnt, bb.cnt from (select w1_id, count(c.w2_id) as cnt from co_n c where w1_id>100 group by w1_id) aa, (select w2_id, count(c.w1_id) as cnt from co_n c where w2_id>100 group by w2_id) bb where aa.w1_id=bb.w2_id;
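The ratio query can be tried on a small in-memory database. The following is a minimal sketch, not the pipeline behind the tables above. It assumes a reduced schema with only the columns the queries use (`words(w_id, word, freq)` and `co_n(w1_id, w2_id)`), uses SQLite instead of the original database server, and fills the tables with invented toy data. The helper names `build_toy_db` and `neighbor_ratios` are hypothetical. Because SQLite divides integers without a fractional part, the sketch multiplies by `1.0` to get a fractional ratio.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with an assumed, reduced schema and toy data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)")
    # One row per neighbor pair: w1_id is the left word, w2_id its right neighbor.
    conn.execute("CREATE TABLE co_n (w1_id INTEGER, w2_id INTEGER)")
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?)",
        [(101, "The", 50), (102, "don", 20), (103, "cat", 10),
         (104, "dog", 8), (105, "t", 20), (106, "we", 12)],
    )
    conn.executemany(
        "INSERT INTO co_n VALUES (?, ?)",
        [(101, 103), (101, 104), (101, 106), (104, 101),
         (103, 102), (104, 102), (106, 102), (102, 105),
         (105, 103), (105, 104)],
    )
    return conn


def neighbor_ratios(conn, min_id=100):
    """Return (word, freq, right_cnt, left_cnt, ratio) rows, highest ratio first."""
    query = """
        SELECT w.word, w.freq, aa.cnt, bb.cnt,
               1.0 * aa.cnt / bb.cnt AS r  -- 1.0 * avoids SQLite integer division
        FROM words w
        JOIN (SELECT w1_id, COUNT(w2_id) AS cnt FROM co_n
              WHERE w1_id > ? GROUP BY w1_id) aa ON w.w_id = aa.w1_id
        JOIN (SELECT w2_id, COUNT(w1_id) AS cnt FROM co_n
              WHERE w2_id > ? GROUP BY w2_id) bb ON aa.w1_id = bb.w2_id
        ORDER BY r DESC
    """
    return conn.execute(query, (min_id, min_id)).fetchall()


if __name__ == "__main__":
    for word, freq, right_cnt, left_cnt, ratio in neighbor_ratios(build_toy_db()):
        print(f"{word}\t{freq}\t{right_cnt}\t{left_cnt}\t{ratio:.4f}")
```

In the toy data, "The" has three right neighbors and one left neighbor, while "don" has one right neighbor and three left neighbors, so they land at opposite ends of the ranking, mirroring the sentence-initial words and the contraction stems in the two tables. Only words that have at least one right and one left neighbor appear in the result, because of the inner join between the two subqueries.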